Bi-affinity filter: a bilateral type filter for color images

ABSTRACT

An edge preserving filter that works on the principle of matting affinity allows a better representation of the range filter term in bilateral class filters. The definition of the affinity term can be relaxed to suit different applications. An approximate bi-affinity filter whose output is shown to be very similar to the traditional bilateral filter is defined. The present technique has the added advantage that no color space changes are required and hence an input image can be handled in its original color space. This is a big benefit over the traditional bilateral filter, which needs conversion to perception based spaces, such as CIELAB, to generate results close to the present invention. The full bi-affinity filter preserves very minute details of the input image, and thus permits an enhanced zooming functionality.

BACKGROUND

1. Field of Invention

The present invention pertains to the field of image processing. More specifically, it pertains to the field of image smoothing by bilateral filtering.

2. Description of Related Art

The bilateral filter was originally proposed by Tomasi and Manduchi in Bilateral Filtering for Gray and Color Images, ICCV: Proceedings of the Sixth International Conference on Computer Vision, page 839, 1998, IEEE Computer Society, herein incorporated in its entirety by reference. Basically, the bilateral filter smooths an image while preserving edges by means of a nonlinear combination of nearby image values. The principal idea behind such a filtering operation is to combine information from the spatial domain as well as from a feature domain. It can be represented as

$\begin{matrix} {{h(x)} = {\frac{1}{k(x)}{\sum\limits_{y \in \Omega_{x}}{{f_{s}\left( {x,y} \right)}{g_{r}\left( {{I(x)},{I(y)}} \right)}{I(y)}}}}} & (1) \end{matrix}$ where I and h are the input and output images respectively, x and y are pixel locations over an image grid, Ω_(x) is the neighborhood induced around the central pixel x, f_(s)(x,y) measures the spatial affinity between pixels at x and y (i.e. the spatial domain, and may be thought of as a spatial filter) and g_(r)(I(x),I(y)) denotes the feature/measurement/photometric affinity (i.e. the feature domain, and may be thought of as a range filter). Parameter k(x) is the normalization term given by

$\begin{matrix} {{k(x)} = {\sum\limits_{y \in \Omega_{x}}{{f_{s}\left( {x,y} \right)}{g_{r}\left( {{I(x)},{I(y)}} \right)}}}} & (2) \end{matrix}$ The spatial and range filters (f and g, respectively), are commonly set to be Gaussian filters, as follows:

$\begin{matrix} {f_{s}\left( x,y \right) = \exp\left( \frac{- \left\| x - y \right\|_{2}^{2}}{2\sigma_{s}^{2}} \right)} & (3) \\ {g_{r}\left( u,v \right) = \exp\left( \frac{- \left\| u - v \right\|_{2}^{2}}{2\sigma_{r}^{2}} \right)} & (4) \end{matrix}$ parameterized by the standard deviations σ_(s) and σ_(r) (i.e. by the variances σ_(s)² and σ_(r)²). The range filter, g, penalizes distance in the feature space and hence the filter has an inherent edge preserving property. Due to this property, the bilateral filter, as an edge-preserving filter, has been one of the most widely used filtering techniques within the computer vision community.
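Equations (1) through (4) can be illustrated with a minimal sketch in Python. This is only an illustration under stated assumptions, not an implementation taken from the cited reference: the function name `bilateral_filter`, the parameter `radius`, the grayscale list-of-rows image layout, and the clipping of the neighborhood at the image border are all choices made for this example.

```python
import math

def bilateral_filter(image, radius=1, sigma_s=1.0, sigma_r=0.1):
    """Apply Eqs. (1)-(4) to a grayscale image given as a list of rows."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            num = 0.0  # numerator of Eq. (1)
            k = 0.0    # normalization term k(x) of Eq. (2)
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < rows and 0 <= cc < cols):
                        continue  # assumption: neighborhood is clipped at the border
                    # Spatial Gaussian of Eq. (3).
                    f_s = math.exp(-(dr * dr + dc * dc) / (2.0 * sigma_s ** 2))
                    # Range Gaussian of Eq. (4).
                    diff = image[r][c] - image[rr][cc]
                    g_r = math.exp(-(diff * diff) / (2.0 * sigma_r ** 2))
                    num += f_s * g_r * image[rr][cc]
                    k += f_s * g_r
            out[r][c] = num / k  # k >= 1 because the center pixel contributes 1
    return out

# A vertical step edge: the range term suppresses averaging across the edge.
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
smoothed = bilateral_filter(step)
```

With a small σ_(r) relative to the step height, pixels on opposite sides of the edge receive a negligible range weight, so the edge survives the smoothing.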

The bilateral filter is a non-linear filter (making it a computationally intensive operation) and, as such, many researchers have proposed techniques to decompose the non-linear, bilateral filter into a sum of separate one-dimensional filters or similar cascaded representations. Singular value decomposition of the 2D kernel is one such technique. Another proposed technique is an approximation of the bilateral filter by filtering sub-sampled copies of an image with discrete intensity kernels, and recombining the results using linear interpolation.

Recently the run-time of the bilateral filter has been identified as a critical bottleneck, and a few techniques have been proposed that render the filtering operation at an almost constant time, albeit with larger space requirements and behavioral approximations.

The research into improving the performance of the bilateral filter heavily relies on the form of the filter, which is applied in a range domain as well as the spatial domain. One method can be entirely broken down to an approximation of a product of a box filter for smoothing and a polynomial or fourth order Taylor series approximation of a Gaussian kernel. However, when the form of the filter is changed, such methods cannot be applied for the promised increase in speed.

It is an object of the present invention to provide a filter that provides similar image smoothing and edge preserving properties as a bilateral filter, but which is less constrained by nonlinear combination of image values.

It is a further object of the present invention to provide a filter that provides similar image smoothing and edge preserving properties as a bilateral filter, but which lends itself to increased implementation optimizations.

It is a further object of the present invention to provide a filter that not only provides similar image smoothing and edge preserving properties as a bilateral filter, but also provides additional image optimization properties.

SUMMARY OF INVENTION

The above objects are met in a method of smoothing and enhancing an input image by using color line techniques to determine the color affinity of pixels within their native color space.

More specifically, the above objects are met in a method of processing a digital input image, comprising: providing the digital input image, I, in physical memory, the input image having color information in a native color space for each image pixel; providing a processing device coupled to the physical memory to access the input image I and create an output image, h, by implementing the following steps:

(a) applying the following sequence to the input image I,

${h_{\sigma,\mspace{14mu} ɛ}(x)} = \frac{\sum\limits_{y \in \Omega_{x}}\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}{I(y)}} \right)}{\sum\limits_{y \in \Omega_{x}}\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}} \right)}$

where x and y are pixel locations over an image grid, and Ω_(x) (or equivalently window w) is the neighborhood induced around the central pixel x; f_(s)(x,y) measures the spatial affinity between pixels at x and y; parameter σ controls an amount of spatial blurring; L_(xy) ^(ε) is a laplacian matrix that provides positive color affinity value for two examined pixels, x and y, that have the same color and provides a zero color affinity value for pixels with different color; parameter ε is a regularization term whose relative weight determines the smoothness of output image h.

In this method, step (a) is applied to the input image I in its native color space, which is preferably the RGB color space.

The laplacian matrix L_(xy) ^(ε) follows a 2 color model specifying that pixel color I_(i) in an input image I can be represented as a linear combination of two colors P and S as follows: I_(i)=α_(i)P+(1−α_(i))S, ∀i∈w, 0≤α_(i)≤1, where the two colors are piecewise smooth and can be derived from local properties within neighborhood Ω_(x) containing the pixel i, and wherein piecewise smooth means smooth over a small region, i.e. a region similar in size as window w. It is noted that parameter ε determines the smoothness of the decomposition into the two color modes as represented by the smoothness of α estimates. In this approach, α_(i) is defined as:

${\alpha_{i} = {{\sum\limits_{c}{a^{c}I_{i}^{c}}} + b}},{\forall{i\; \in \; w}},{c \in \left\{ {1,2,3} \right\}}$ where a^(c) are the weights for each color channel, which are constant for the window w, b is a model parameter, and c is an index that is unique for each window over the color channels. Preferably, the α values are determined according to quadratic cost function J(α)=α^(T)Lα.

In this case, the laplacian matrix L is an n×n matrix, where n is the total number of pixels in input image I whose (ij)^(th) element is given by

$\sum\limits_{k|{(i,j)} \in w_{k}}\left( \delta_{ij} - \frac{1}{\left| w_{k} \right|}\left( 1 + \left( I_{i} - \mu_{k} \right)^{T}\left( \sigma_{k} + \frac{\varepsilon}{\left| w_{k} \right|}I_{3} \right)^{- 1}\left( I_{j} - \mu_{k} \right) \right) \right)$ where δ_(ij) is the Kronecker delta, μ_(k) is a 3×1 mean vector of colors inside the k^(th) window with both i and j as members, σ_(k) is the 3×3 covariance matrix, |w_(k)| is the cardinality of the window and I₃ is the identity matrix of size 3×3.

The laplacian matrix L may be further decomposed into a diagonal matrix D and a weight matrix W with the formulation L=D−W, wherein: diagonal matrix D is defined with terms D_(ii)=#[k|i∈w_(k)] at its diagonal and represents the cardinality of the number of windows of which the pixel i is a member; individual terms of weight matrix W are given by

$W_{ij} = \sum\limits_{k|{(i,j)} \in w_{k}}\frac{1}{\left| w_{k} \right|}\left( 1 + \left( I_{i} - \mu_{k} \right)^{T}\left( \sigma_{k} + \frac{\varepsilon}{\left| w_{k} \right|}I_{3} \right)^{- 1}\left( I_{j} - \mu_{k} \right) \right)$ where μ_(k) is a 3×1 mean vector of colors inside the k^(th) window with both i and j as members, σ_(k) is the 3×3 covariance matrix, |w_(k)| is the cardinality of the window and I₃ is the identity matrix of size 3×3; and color affinity W_(ij) for two pixels with the same color is a positive quantity varying with the homogeneity of the local windows containing the pixels i and j, and color affinity for pixels with different color is zero.

Additionally at the local minima, the solution α* satisfies a first order optimality condition L^(T)α*=0, and the optimal condition is defined as L^(T)α*=(D−W)^(T)α*.

The terms of W_(i,j) may be evaluated by counting the contribution of only the center pixel's local window defined as the window centered about it. Preferably, the local window w is a 3×3 pixel window requiring O(w²) computations.
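The single-window evaluation of W_(ij) can be sketched as follows. This is a hedged illustration rather than the claimed implementation: the helper names `solve3` and `window_affinities`, the flattened row-major window layout, and the normalization of the covariance by |w_(k)| are assumptions made for this example.

```python
def solve3(m, v):
    """Solve the 3x3 system m @ x = v by Gaussian elimination with pivoting."""
    a = [list(m[i]) + [v[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * 3
    for i in (2, 1, 0):
        s = a[i][3] - sum(a[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = s / a[i][i]
    return x

def window_affinities(colors, center, eps=0.1):
    """Return W_ij for i = center and every pixel j of one window of RGB triples."""
    n = len(colors)
    mu = [sum(p[c] for p in colors) / n for c in range(3)]
    # Covariance (assumed normalized by n) plus the term (eps / n) * I_3.
    cov = [[sum((p[a] - mu[a]) * (p[b] - mu[b]) for p in colors) / n
            + (eps / n if a == b else 0.0)
            for b in range(3)] for a in range(3)]
    d_i = [colors[center][c] - mu[c] for c in range(3)]
    # z = (sigma_k + eps/|w_k| I_3)^{-1} (I_i - mu_k); cov is symmetric,
    # so (I_i - mu_k)^T cov^{-1} (I_j - mu_k) equals the dot product z . (I_j - mu_k).
    z = solve3(cov, d_i)
    weights = []
    for p in colors:
        d_j = [p[c] - mu[c] for c in range(3)]
        weights.append((1.0 + sum(z[c] * d_j[c] for c in range(3))) / n)
    return weights

red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
# 3x3 window in row-major order; index 4 is the (red) center pixel.
window = [red, red, blue, red, red, blue, red, blue, blue]
affinity = window_affinities(window, center=4)
```

In this two-color window the red center pixel receives a noticeably larger affinity to the other red pixels than to the blue ones, and the affinities of one window sum to one because the deviations from the window mean cancel.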

In another embodiment of the present invention, the method further includes steps of: (b) interpolating the input image I to a resolution higher than its native resolution; and (c) combining the results of step (a) with the interpolated, higher resolution image of step (b). In this approach, step (a) generates detail line information of the input image I, and step (c) adds this line information to the interpolated, higher resolution image.

In the above examples, the laplacian matrix L acts like a zero-sum filter kernel.
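The zero-sum behavior can be checked from the definitions above. The following short derivation is offered as a sketch; it assumes only that μ_(k) is the mean color of window w_(k), so that the deviations (I_(j)−μ_(k)) sum to zero over the window, and it writes the quadratic form with an explicit transpose:

$\begin{aligned} \sum\limits_{j \in w_{k}}\frac{1}{\left| w_{k} \right|}\left( 1 + \left( I_{i} - \mu_{k} \right)^{T}\left( \sigma_{k} + \frac{\varepsilon}{\left| w_{k} \right|}I_{3} \right)^{- 1}\left( I_{j} - \mu_{k} \right) \right) &= 1 + \frac{1}{\left| w_{k} \right|}\left( I_{i} - \mu_{k} \right)^{T}\left( \sigma_{k} + \frac{\varepsilon}{\left| w_{k} \right|}I_{3} \right)^{- 1}\sum\limits_{j \in w_{k}}\left( I_{j} - \mu_{k} \right) = 1 \\ \sum\limits_{j}W_{ij} &= \sum\limits_{k|i \in w_{k}}1 = \#\left\lbrack k|i \in w_{k} \right\rbrack = D_{ii} \\ \sum\limits_{j}L_{ij} &= D_{ii} - \sum\limits_{j}W_{ij} = 0 \end{aligned}$

Each row of L therefore sums to zero, which is the sense in which L behaves like a zero-sum kernel.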

The present invention may further be applied to a method for enlarging an input image I. This method preferably includes: providing the digital input image, I, in physical memory, the input image having color information in a native color space for each image pixel; providing a processing device coupled to the physical memory to access the input image I and create an output image, h, by implementing the following steps: (a) applying the following sequence to the input image I,

${h_{\sigma,ɛ}(x)} = \frac{\sum\limits_{y\; \in \Omega_{x}}\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}{I(y)}} \right)}{\sum\limits_{y\; \in \Omega_{x}}\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}} \right)}$ where: (i) x and y are pixel locations over an image grid; (ii) Ω_(x) is the neighborhood induced around the central pixel x; (iii) f_(s)(x,y) measures the spatial affinity between pixels at x and y; (iv) parameter σ controls an amount of spatial blurring; (v) L_(xy) ^(ε) is a laplacian matrix that provides positive color affinity value for two examined pixels, x and y, that have the same color and provides a zero color affinity value for pixels with different color; (vi) parameter ε is a regularization term whose relative weight determines the smoothness of output image h; (vii) laplacian matrix L is defined as a diagonal matrix, D, and a weight matrix, W, with the formulation L=D−W, where diagonal matrix D is further defined as D_(ii)=#[k|i∈w_(k)] at its diagonal, which represents the cardinality of the number of windows of which the pixel i is a member, individual terms of the weight matrix W are given by

$W_{ij} = \sum\limits_{k|{(i,j)} \in w_{k}}\frac{1}{\left| w_{k} \right|}\left( 1 + \left( I_{i} - \mu_{k} \right)^{T}\left( \sigma_{k} + \frac{\varepsilon}{\left| w_{k} \right|}I_{3} \right)^{- 1}\left( I_{j} - \mu_{k} \right) \right)$ and weight matrix W_(ij) is evaluated over all possible overlapping windows that contain the center pixel; (b) interpolating the input image I to a resolution higher than its native resolution; and (c) combining the results of step (a) with the interpolated, higher resolution image of step (b).

In this approach, step (a) generates detail line information of the input image I, and step (c) adds this line information to the interpolated, higher resolution image.

Preferably, step (a) is applied to the input image I in its native color space, which is the RGB color space.

The present invention may further be applied in a method of smoothing an input image I, having: providing the digital input image, I, in physical memory, the input image having color information in a native color space for each image pixel; providing a processing device coupled to the physical memory to access the input image I and create an output image, h, by implementing the following steps: (a) applying the following sequence to the input image I,

${h_{\sigma,ɛ}(x)} = \frac{\sum\limits_{y\; \in \Omega_{x}}\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}{I(y)}} \right)}{\sum\limits_{y \in \Omega_{x}}\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}} \right)}$ where: (i) x and y are pixel locations over an image grid; (ii) Ω_(x) is the neighborhood induced around the central pixel x; (iii) f_(s)(x,y) measures the spatial affinity between pixels at x and y; (iv) parameter σ controls an amount of spatial blurring; (v) L_(xy) ^(ε) is a laplacian matrix that provides positive color affinity value for two examined pixels, x and y, that have the same color and provides a zero color affinity value for pixels with different color; (vi) parameter ε is a regularization term whose relative weight determines the smoothness of output image h; (vii) laplacian matrix L is defined as a diagonal matrix, D, and a weight matrix, W, with the formulation L=D−W, where diagonal matrix D is further defined as D_(ii)=#[k|i∈w_(k)] at its diagonal, which represents the cardinality of the number of windows of which the pixel i is a member, individual terms of the weight matrix W are given by

$W_{ij} = \sum\limits_{k|{(i,j)} \in w_{k}}\frac{1}{\left| w_{k} \right|}\left( 1 + \left( I_{i} - \mu_{k} \right)^{T}\left( \sigma_{k} + \frac{\varepsilon}{\left| w_{k} \right|}I_{3} \right)^{- 1}\left( I_{j} - \mu_{k} \right) \right)$ and terms of W_(i,j) are evaluated by counting the contribution of only the center pixel's local window defined as the window centered about it.

Preferably, the local window w is a 3×3 pixel window. Further preferably, regularization factor ε is at least 0.1.

Like before, the present method may be applied directly to the input image I in its native color space, which typically is the RGB color space.

Other objects and attainments together with a fuller understanding of the invention will become apparent and appreciated by referring to the following description and claims taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings, like reference symbols refer to like parts.

FIG. 1 is an RGB color histogram of real world, natural, color images.

FIG. 2 illustrates a sample setup for implementing the present Bi-affinity filter.

FIG. 3 a illustrates an original image with sharp edges.

FIG. 3 b illustrates an edge-rounding effect of applying the bilateral filter on the original image of FIG. 3 a.

FIG. 3 c illustrates an edge-preserving effect of applying the Bi-affinity filter on the original image of FIG. 3 a.

FIG. 4 a shows a center pixel C1 and nine overlapping 3×3 pixel windows that include center pixel C1.

FIG. 4 b shows only the four overlapping windows that encompass both center pixel C1 and target neighbor pixel N1.

FIG. 4 c shows only the center pixel's local window, which is the window centered about C1.

FIG. 5 illustrates the effect of varying regularization term ε for pixel windows of varying size.

FIG. 6 shows the effect of regularization term ε for a fixed window size, as regularization term ε is assigned values of 0.0005, 0.005, 0.05, and 0.5.

FIGS. 7 a and 7 b show comparisons of the PSNR of the present Bi-affinity filter applied in an image's native RGB color space versus the bilateral filter applied in both the image's native RGB color space and converted CIELAB color space, for a fixed range filter variance and a varied window size.

FIG. 8 compares the results of applying the present Bi-affinity filter to an input image with the results of applying a bilateral filter to the input image in its native RGB color domain and after being converted to the CIELAB color space.

FIG. 9 shows an additional set of experimental results similar to those of FIG. 8 for comparing the Bi-affinity filter to the bilateral filter in both the native RGB color space and in the CIELAB color space.

FIG. 10 shows an additional set of experimental results similar to those of FIG. 8 for comparing the Bi-affinity filter to the bilateral filter in both the native RGB color space and in the CIELAB color space.

FIGS. 11 a, 11 b, and 11 c show the application of the present Bi-affinity filter to enhance a low resolution image.

FIG. 12 a shows an image zoomed-in by a factor of 2× by bi-cubic interpolation.

FIG. 12 b shows the same zoomed-in image of FIG. 12 a additionally enhanced by the present Bi-affinity filter.

FIG. 13 a shows an image zoomed-in by a factor of 2× by bi-cubic interpolation.

FIG. 13 b shows the zoomed-in image of FIG. 13 a additionally enhanced by the present Bi-affinity filter.

FIG. 14 a shows an image zoomed-in by a factor of 2× by bi-cubic interpolation.

FIG. 14 b shows the zoomed-in image of FIG. 14 a additionally enhanced by the present Bi-affinity filter.

FIGS. 15 a and 15 b show a graphical model for image enhancement.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

It is presently proposed that the true power of the bilateral filter is yet to be realized owing to the fact that very few families of kernels have been tried with the bilateral paradigm.

Traditionally, research has overlooked one of the most important shortfalls of the bilateral filter, which is a unified handling of multi-channel color images. This is due to the fact that color channels are traditionally assumed to be independent of each other, and the filter therefore processes each color channel independently. For example, an RGB color space, which has three color channels (red, green, and blue), requires separate application of the bilateral filter to each of three color channels. As a direct consequence of this, the bilateral filter produces color artifacts at sharp color edges in the RGB color space. This poses a problem since the RGB color space is typically the native color space of color digital images, i.e. images obtained through various digital imaging techniques, such as digital photography or image scanning operations.

One remedy previously proposed is to convert from the native RGB color space to the CIELAB color space, which attempts to approximate human vision (or perception). Like the RGB color space, the CIELAB color space also has three color channels, and thus still requires that the bilateral filter be applied independently to each of three color channels, but the CIELAB color space is noted for avoiding color artifacts. That is, according to this approach, once the native RGB color image is converted to the CIELAB color space, then channel wise bilateral filter processing does not produce the artifacts apparent from processing in the RGB color space.

The present discussion further investigates this weakness and proposes a new technique that works at par with the transformed domain techniques (i.e. converting from a native RGB color space to an alternate color space) that have been the standard practices within the digital imaging community, thus far.

A new filter, hereafter called a “Bi-affinity filter”, for color images is presently proposed. This filter is similar in structure to a bilateral filter, but the proposed Bi-affinity filter is based on a color line model. As will be shown, this approach permits the elimination of the explicit conversion from the RGB color space to perception based color spaces, such as CIELAB.

The present Bi-affinity filter measures the color affinity of a pixel to a small neighborhood around it and weighs the filter term accordingly. It is put forth that this method can perform at par with standard bilateral filters for color images. An added benefit of the present Bi-affinity filter is that not only does it preserve big edges of an input image (a desired characteristic of the bilateral filter), but also preserves and enhances small edges of the input image, which leads to additional applications as an image enhancement filter.

As is explained above, the principal objective of the bilateral filter is to combine information from the spatial domain with a feature domain, and this feature domain is generally the color domain. Determining color affinity between pixels in color images in the RGB domain is complicated by differences in light intensity causing pixels that are close in color to appear quite different and thus have markedly different RGB values. It is therefore generally preferred to convert from the RGB color space to an alternate color space better capable of filtering out differences in light intensity, and apply the bilateral filter in the alternate color space.

Many alternative color spaces have been suggested to separate color from light intensity and create methods for determining whether two pixels share the same color, or not. These color spaces can be divided into two groups, namely linear and non-linear. Among the linear color spaces, the most widely used are the YCrCb, YUV, and YIQ color spaces. Among the non-linear color spaces, two sub-groups are popular: a first sub-group that separates color into hue (i.e. color), saturation (i.e. purity), and value (i.e. light intensity), and a second sub-group that separates color into luminance plus two color coordinates. The first group may further include the HSV, HSI, HSB, HSL, etc. color spaces. The second group are the CIELAB and CIE-LUV color spaces that separate color into luminance (i.e. light intensity) and two color coordinates in an effort to create a color space that is perceptually uniform (i.e. close to how the human eye perceives color variations).

To determine whether two pixels have the same real world color, or not, the color coordinates of a generic color model are used. These color models assume either that there is no color distortion or that there is an identical color distortion for all imaging conditions. In practice, when dealing with real world images of an unknown source, these assumptions are rarely true as scene surface color is distorted differently in different images as well as different image regions, depending on the scene and camera settings.

This leads to the topic of the color line model. The introduction of color lines has been attributed to Omer et al. in Color lines: Image Specific Color Representation, CVPR, 2004, herein incorporated in its entirety by reference. Omer et al. proposed that natural color images form clusters of pixel colors in the RGB color space. These clusters of pixels appear to form mostly tubular regions, due to the fact that most small regions in natural images can be decomposed into a linear combination of 2 colors. This led to the two-color model, described below.

With reference to FIG. 1, when looking at the RGB color histogram 10 of real world, natural images, it can be clearly observed that the histogram is very sparse and structured. The color line model exploits these two properties of RGB color histograms by describing the elongated color clusters, or tubular regions, 11-15. Since the structure and arrangement of the color clusters 11-15 are specific to an image, this results in an image-specific color representation that has two important properties: robustness to color distortion and a compact description of colors in an image. This idea has been used for image matting (i.e. separation of a foreground subject from its surrounding background), Bayer demosaicing (i.e. reconstruction of a color image from incomplete color samples output from an image sensor overlaid with a Bayer color filter array), and more recently for image de-noising and de-blurring.

The two-color model states that any pixel color I_(i) in an image I can be represented as a linear combination of two colors P and S, where these colors are piecewise smooth and can be derived from local properties within a small neighborhood containing the pixel i:

$\begin{matrix} {I_{i} = \alpha_{i}P + \left( 1 - \alpha_{i} \right)S,\quad\forall i \in w,\quad 0 \leq \alpha_{i} \leq 1} & (5) \end{matrix}$ where w is a small patch, or window, and α is the matting coefficient. The patch size is a key parameter in this two-color model, as the model holds only for small neighborhoods, relative to the resolution and size of the image. These “small” neighborhoods are typically defined as 3×3 pixel neighborhoods. As the resolution and the size of the images grow, so should the window size, to capture a valid neighborhood.

As Levin et al. show in A Closed Form Solution to Natural Image Matting, CVPR, 2006 (herein incorporated in its entirety by reference), a similar line of reasoning may be used for color images. For color images, if the color line property is obeyed, then the 4D linear model satisfied by the matting coefficient, within a small window, at each pixel can be written as

$\begin{matrix} {{\alpha_{i} = {{\sum\limits_{c}{a^{c}I_{i}^{c}}} + b}},{\forall\;{i \in w}},{c \in \left\{ {1,2,3} \right\}}} & (6) \end{matrix}$ where a^(c) are the weights for each color channel, which are constant for the window w, b is a model parameter, and c is an index that is unique for each window over the color channels.
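The link between the two-color model of Eq. (5) and the linear model of Eq. (6) can be illustrated with a small sketch. It assumes pixels that lie exactly on the color line through P and S; the function names `mix` and `linear_alpha_model` and the sample colors are hypothetical, and the returned (a, b) is only one valid choice, since any weight vector a with a·(P−S)=1 and b=−a·S satisfies Eq. (6) on that line.

```python
def mix(alpha, p, s):
    """Two-color model of Eq. (5): I = alpha * P + (1 - alpha) * S."""
    return [alpha * pc + (1.0 - alpha) * sc for pc, sc in zip(p, s)]

def linear_alpha_model(p, s):
    """Return one (a, b) such that alpha = sum_c a[c] * I[c] + b on the line P-S."""
    d = [pc - sc for pc, sc in zip(p, s)]          # direction of the color line
    norm2 = sum(dc * dc for dc in d)
    a = [dc / norm2 for dc in d]                   # per-channel weights a^c
    b = -sum(ac * sc for ac, sc in zip(a, s))      # offset so that alpha(S) = 0
    return a, b

P = [0.9, 0.2, 0.1]  # hypothetical first color
S = [0.1, 0.3, 0.8]  # hypothetical second color
a, b = linear_alpha_model(P, S)

# Recover the matting coefficient of several synthetic pixels with Eq. (6).
alphas = (0.0, 0.25, 0.5, 1.0)
recovered = [sum(ac * ic for ac, ic in zip(a, mix(t, P, S))) + b for t in alphas]
```

The recovered values match the coefficients used to synthesize the pixels, which is why a, b can be treated as constant within a window that obeys the color line property.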

A cost function for evaluating the matting coefficient α can be formulated from this model. For an image with N pixels, the cost is defined as

$\begin{matrix} {J\left( \alpha,a,b \right) = \sum\limits_{k \in N}\left( \sum\limits_{i \in w_{k}}\left( \alpha_{i} - \sum\limits_{c}a_{k}^{c}I_{i}^{c} - b_{k} \right)^{2} + \varepsilon\sum\limits_{c}\left( a_{k}^{c} \right)^{2} \right)} & (7) \end{matrix}$ where w_(k) is a small window around pixel k and a={a_(i) ^(c)}, for all i=[1,N]. ε is a regularization weight for uniqueness as well as smoothness of the solution.

A first theorem is herein proposed, as follows.

Theorem 1: Let J(α)=min_(a,b)J(α,a, b), then J(α)=α^(T)Lα, where L is a laplacian matrix. As is known in the art, and in particular in the mathematical field of graph theory, a laplacian matrix (also called an admittance matrix or Kirchhoff matrix) is a matrix representation of a graph. Together with Kirchhoff's theorem, it can be used to calculate the number of spanning trees for a given graph. In the present case, L is an n×n matrix, whose (ij)^(th) element is given by

$\begin{matrix} {\sum\limits_{k|{(i,j)} \in w_{k}}\left( \delta_{ij} - \frac{1}{\left| w_{k} \right|}\left( 1 + \left( I_{i} - \mu_{k} \right)^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1}\left( I_{j} - \mu_{k} \right) \right) \right)} & (8) \end{matrix}$ where δ_(ij) is the Kronecker delta, μ_(k) is a 3×1 mean vector of colors inside the k^(th) window with both i and j as members, I_(i) and I_(j) are the color vectors at location i and j,

${{\overset{\sim}{\Sigma}}_{k} = \Sigma_{k} + \frac{\varepsilon}{\left| w_{k} \right|}I_{3}},$ where Σ_(k) is the 3×3 covariance matrix, |w_(k)| is the cardinality of the window and I₃ is the 3×3 identity matrix. Proof: Levin et al. provide a limited proof based on an extension from a gray scale case. Below is presented a full 3-channel proof that can be readily extended to more channels, if necessary. Rewriting Eq. 7 in matrix notation, where ∥•∥ denotes the 2-norm,

$\begin{matrix} {J\left( \alpha,a,b \right) = \sum\limits_{k}\left\| G_{k} \cdot {\overset{\_}{a}}_{k} - \alpha_{k} \right\|^{2}} & (9) \end{matrix}$ where

$\begin{matrix} {G_{k} = \begin{bmatrix} I_{1}^{R} & I_{1}^{G} & I_{1}^{B} & 1 \\ \vdots & \vdots & \vdots & \vdots \\ I_{|w_{k}|}^{R} & I_{|w_{k}|}^{G} & I_{|w_{k}|}^{B} & 1 \\ \sqrt{\varepsilon} & 0 & 0 & 0 \\ 0 & \sqrt{\varepsilon} & 0 & 0 \\ 0 & 0 & \sqrt{\varepsilon} & 0 \end{bmatrix},\quad{\overset{\_}{a}}_{k} = \begin{bmatrix} a_{k}^{R} \\ a_{k}^{G} \\ a_{k}^{B} \\ b_{k} \end{bmatrix},\quad\alpha_{k} = \begin{bmatrix} \alpha_{1} \\ \vdots \\ \alpha_{|w_{k}|} \\ 0 \\ 0 \\ 0 \end{bmatrix}} & (10) \end{matrix}$ Note that another representation of G_(k) is possible where the last 3 rows are combined to a single row of the form [√ε √ε √ε 0], but this form leads to an unstable covariance matrix. For known α_(k), the least square problem can be solved as

$\begin{matrix} {{\overset{\_}{a}}_{k} = \arg\min\left\| G_{k} \cdot {\overset{\_}{a}}_{k} - \alpha_{k} \right\|^{2}} & (11) \\ {= \left( G_{k}^{T}G_{k} \right)^{- 1}G_{k}^{T}\alpha_{k}} & (12) \end{matrix}$ Substituting this solution in Eq. 9, and denoting L_(k)=I_(|w_(k)|+3)−G_(k)(G_(k) ^(T)G_(k))⁻¹G_(k) ^(T), where I_(|w_(k)|+3) is the identity matrix of size (|w_(k)|+3), one obtains,

$\begin{matrix} \begin{matrix} {J(\alpha) = \sum\limits_{k}\left\| L_{k}\alpha_{k} \right\|^{2}} \\ {= \sum\limits_{k}\left( \alpha_{k}^{T}L_{k}^{T}L_{k}\alpha_{k} \right)} \end{matrix} & (13) \end{matrix}$ Making the additional observation that

$\begin{aligned} L_{k}^{T}L_{k} &= \left( I_{|w_{k}| + 3} - G_{k}\left( G_{k}^{T}G_{k} \right)^{- 1}G_{k}^{T} \right)^{T}\left( I_{|w_{k}| + 3} - G_{k}\left( G_{k}^{T}G_{k} \right)^{- 1}G_{k}^{T} \right) \\ &= I_{|w_{k}| + 3} + G_{k}\left( G_{k}^{T}G_{k} \right)^{- 1}G_{k}^{T}G_{k}\left( G_{k}^{T}G_{k} \right)^{- 1}G_{k}^{T} - 2G_{k}\left( G_{k}^{T}G_{k} \right)^{- 1}G_{k}^{T} \\ &= I_{|w_{k}| + 3} - G_{k}\left( G_{k}^{T}G_{k} \right)^{- 1}G_{k}^{T} \\ &= L_{k} \end{aligned}$ one can write equation (13) as J(α)=Σ_(k)(α_(k) ^(T)L_(k)α_(k)). To complete the proof one needs to find the expression for L_(k)|_(i,j).

Noting the identity E[X²]=σ_(XX) ²+E[X]², denoting the individual channel means E[R] as R, one can write

$\begin{matrix} {G_{k}^{T}G_{k} = \left| w_{k} \right|\begin{bmatrix} \overbrace{\begin{pmatrix} \sigma_{RR}^{2} + R^{2} + \frac{\varepsilon}{\left| w_{k} \right|} & \sigma_{RG}^{2} + RG & \sigma_{RB}^{2} + RB \\ \sigma_{GR}^{2} + GR & \sigma_{GG}^{2} + G^{2} + \frac{\varepsilon}{\left| w_{k} \right|} & \sigma_{GB}^{2} + GB \\ \sigma_{BR}^{2} + BR & \sigma_{BG}^{2} + BG & \sigma_{BB}^{2} + B^{2} + \frac{\varepsilon}{\left| w_{k} \right|} \end{pmatrix}}^{A} & \overbrace{\begin{pmatrix} R \\ G \\ B \end{pmatrix}}^{D} \\ \underbrace{\begin{pmatrix} R & G & B \end{pmatrix}}_{D^{T}} & \underbrace{1}_{C} \end{bmatrix}} & (14) \end{matrix}$ where the matrix has been divided into 4 components. Note that D=μ_(k) for the k^(th) window. The inverse of the above system can now be written as:

$\left( G_{k}^{T}G_{k} \right)^{- 1} = \frac{1}{\left| w_{k} \right|}\begin{bmatrix} P & Q \\ R & S \end{bmatrix}$ where, in this block notation, R denotes the lower-left block of the inverse rather than the red-channel mean, and

$\begin{aligned} P &= \left( A - DC^{- 1}D^{T} \right)^{- 1} = \left( A - DD^{T} \right)^{- 1} \\ &= \begin{bmatrix} \sigma_{RR}^{2} + \frac{\varepsilon}{\left| w_{k} \right|} & \sigma_{RG}^{2} & \sigma_{RB}^{2} \\ \sigma_{GR}^{2} & \sigma_{GG}^{2} + \frac{\varepsilon}{\left| w_{k} \right|} & \sigma_{GB}^{2} \\ \sigma_{BR}^{2} & \sigma_{BG}^{2} & \sigma_{BB}^{2} + \frac{\varepsilon}{\left| w_{k} \right|} \end{bmatrix}^{- 1} = {\overset{\sim}{\Sigma}}_{k}^{- 1} \\ Q &= - P\left( DC^{- 1} \right) = - PD = - {\overset{\sim}{\Sigma}}_{k}^{- 1}\mu_{k} \\ R &= - \left( C^{- 1}D^{T} \right)P = - D^{T}P = - \mu_{k}^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1} \\ S &= C^{- 1} - R\left( DC^{- 1} \right) = 1 - RD = 1 + \mu_{k}^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1}\mu_{k} \end{aligned}$ Putting all the terms together, one can write

$\begin{matrix} {\left( {G_{k}^{T}G_{k}} \right)^{- 1} = {\frac{1}{w_{k}}\begin{bmatrix} {\overset{\sim}{\Sigma}}_{k}^{- 1} & {{- {\overset{\sim}{\Sigma}}_{k}^{- 1}}\mu_{k}} \\ {{- \mu_{k}^{T}}{\overset{\sim}{\Sigma}}_{k}^{- 1}} & {1 + {\mu_{k}^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1}\mu_{k}}} \end{bmatrix}}} & (15) \\ {{G_{k}\left( {G_{k}^{T}G_{k}} \right)}^{- 1} = {\frac{1}{w_{k}}\begin{bmatrix} {\left( {I_{1} - \mu_{k}} \right)^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1}} & {1 - {\left( {I_{1} - \mu_{k}} \right)^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1}\mu_{k}}} \\ {\left( {I_{2} - \mu_{k}} \right)^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1}} & {1 - {\left( {I_{2} - \mu_{k}} \right)^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1}\mu_{k}}} \\ \vdots & \vdots \\ {\left( {I_{w_{k}} - \mu_{k}} \right)^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1}} & {1 - {\left( {I_{w_{k}} - \mu_{k}} \right)^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1}\mu_{k}}} \\ {\sqrt{ɛ}{\overset{\sim}{\Sigma}}_{k}^{- 1}} & {\sqrt{ɛ}{\overset{\sim}{\Sigma}}_{k}^{- 1}\mu_{k}} \end{bmatrix}}} & (16) \end{matrix}$ Right multiplication by G_(k) ^(T) yields the final symmetric form, where only the i^(th) column is shown for conciseness and ease of understanding

${{G_{k}\left( {G_{k}^{T}G_{k}} \right)^{- 1}G_{k}^{T}}\left\lbrack {:{,i}} \right\rbrack} = {\frac{1}{\left| w_{k} \right|}\begin{bmatrix} {1 + {\left( {I_{1} - \mu_{k}} \right)^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1}\left( {I_{i} - \mu_{k}} \right)}} \\ {1 + {\left( {I_{2} - \mu_{k}} \right)^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1}\left( {I_{i} - \mu_{k}} \right)}} \\ {1 + {\left( {I_{3} - \mu_{k}} \right)^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1}\left( {I_{i} - \mu_{k}} \right)}} \\ \vdots \\ {1 + {\left( {I_{|w_{k}|} - \mu_{k}} \right)^{T}{\overset{\sim}{\Sigma}}_{k}^{- 1}\left( {I_{i} - \mu_{k}} \right)}} \\ {\sqrt{ɛ}{\overset{\sim}{\Sigma}}_{k}^{- 1}\left( {I_{i} - \mu_{k}} \right)} \end{bmatrix}}$ Subtracting this from the identity matrix I_(|w_k|+3) and summing over k concludes the proof. Note that G_(k) has 3 extra rows (or C extra rows in the general case of C color channels) for the regularization ε. These can be neglected in the final expression since they do not explicitly affect the other computations.
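The closed forms above can be checked numerically. The following Python sketch is an illustration only. It assumes that G_(k) consists of |w_(k)| rows of the form [I_i^T, 1] followed by three regularization rows [√ε·I₃, 0], which is the structure implied by Eqn. 14, and it uses random colors and hypothetical variable names.

```python
import numpy as np

rng = np.random.default_rng(0)
n_win, eps = 9, 1e-2                      # |w_k| and the regularization term
colors = rng.random((n_win, 3))           # I_1 ... I_|w_k| (random stand-ins)

# Assumed structure of G_k: pixel rows [I_i^T, 1], then rows [sqrt(eps) I_3, 0].
G = np.vstack([
    np.hstack([colors, np.ones((n_win, 1))]),
    np.hstack([np.sqrt(eps) * np.eye(3), np.zeros((3, 1))]),
])

mu = colors.mean(axis=0)                  # window mean, D in Eqn. 14
centered = colors - mu
cov = centered.T @ centered / n_win       # window covariance
sigma_tilde_inv = np.linalg.inv(cov + (eps / n_win) * np.eye(3))

# Closed form of Eqn. 15.
closed = np.block([
    [sigma_tilde_inv, -(sigma_tilde_inv @ mu)[:, None]],
    [-(mu @ sigma_tilde_inv)[None, :],
     np.array([[1.0 + mu @ sigma_tilde_inv @ mu]])],
]) / n_win

direct = np.linalg.inv(G.T @ G)           # brute-force inverse
proj = G @ direct @ G.T                   # G_k (G_k^T G_k)^-1 G_k^T

# Pixel-by-pixel block of the symmetric form shown above.
expected_block = (1.0 + centered @ sigma_tilde_inv @ centered.T) / n_win

print(np.allclose(direct, closed))
print(np.allclose(proj[:n_win, :n_win], expected_block))
```

Both comparisons should hold to floating point precision for any choice of window colors.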

As stated above, the laplacian matrix L is an n×n matrix, where n may also be understood to be the total number of pixels in the image, and whose (ij)^(th) element is given by

$\begin{matrix} {\sum\limits_{k \mid {(i,j)} \in w_{k}}\left( {\delta_{ij} - {\frac{1}{\left| w_{k} \right|}\left( {1 + {\left( {I_{i} - \mu_{k}} \right)\left( {\sigma_{k} + {\frac{ɛ}{\left| w_{k} \right|}I_{3}}} \right)^{- 1}\left( {I_{j} - \mu_{k}} \right)}} \right)}} \right)} & (17) \end{matrix}$ where δ_(ij) is the Kronecker delta, μ_(k) is a 3×1 mean vector of colors inside the k^(th) window with both i and j as members, σ_(k) is the 3×3 covariance matrix, |w_(k)| is the cardinality of the window and I₃ is the identity matrix of size 3×3. Note that the term ε is a regularization term and determines the smoothness of the decomposition into two dominant color modes (i.e. two colors P and S). The laplacian matrix defined by L has been termed the matting laplacian.

The usual decomposition of the laplacian matrix L into a diagonal matrix, D, and a weight matrix, W, leads to the formulation L=D−W. Here D is a diagonal matrix with the terms D_(ii)=#[k|i∈w_(k)] at its diagonal, each of which is the number of windows of which pixel i is a member. The individual terms of the weight matrix W are given by

$\begin{matrix} {W_{ij} = {\sum\limits_{k \mid {(i,j)} \in w_{k}}{\frac{1}{\left| w_{k} \right|}\left( {1 + {\left( {I_{i} - \mu_{k}} \right)\left( {\sigma_{k} + {\frac{ɛ}{\left| w_{k} \right|}I_{3}}} \right)^{- 1}\left( {I_{j} - \mu_{k}} \right)}} \right)}}} & (18) \end{matrix}$

This brings the present discussion to the subject of the Bi-affinity filter. It is first noted that, by the definition of the laplacian matrix, all of its rows sum to zero, which leads to D_(ii)=Σ_(j)W_(ij). At a local minimum, the solution α* satisfies the first order optimality condition L^(T)α*=0. The optimality condition can therefore be written as

$\begin{matrix} {{L^{T}\alpha^{*}} = {\left( {D - W} \right)^{T}\alpha^{*}}} & (19) \\ {{L^{T}\alpha^{*}} = \begin{pmatrix} {{D_{11}\alpha_{1}^{*}} - {\sum\limits_{j}{W_{1j}\alpha_{j}^{*}}}} \\ {{D_{22}\alpha_{2}^{*}} - {\sum\limits_{j}{W_{2j}\alpha_{j}^{*}}}} \\ \vdots \\ {{D_{nn}\alpha_{n}^{*}} - {\sum\limits_{j}{W_{nj}\alpha_{j}^{*}}}} \end{pmatrix}} & (20) \end{matrix}$ Substituting D_(ii)=Σ_(j)W_(ij) into the above system of equations and invoking the first order optimality condition leads to

$\begin{matrix} {\begin{pmatrix} {\sum\limits_{j}\;{\left( {\alpha_{1}^{*} - \alpha_{j}^{*}} \right)W_{1\; j}}} \\ {\sum\limits_{j}\;{\left( {\alpha_{2}^{*} - \alpha_{j}^{*}} \right)W_{2\; j}}} \\ \vdots \\ {\sum\limits_{j}\;{\left( {\alpha_{n}^{*} - \alpha_{j}^{*}} \right)W_{nj}}} \end{pmatrix} = 0} & (21) \end{matrix}$ The effect of this equation is that the color affinity W_(ij) for two pixels with the same color (same α*), is a positive quantity varying with the homogeneity of the local windows containing the pixels i and j as governed by Eqn. 18. But for pixels with different color (different α*) the color affinity is zero. In essence the rows of the laplacian matrix L work as a zero-sum filter kernel, after appropriate resizing of the window.
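The zero-sum property of the rows of L=D−W can be illustrated on a very small image. The Python sketch below accumulates W according to Eqn. 18 as a dense matrix. It is a sketch under stated assumptions rather than a definitive implementation: only windows lying fully inside the image are counted, the dense n×n storage is practical only for tiny images, and the function and parameter names are hypothetical.

```python
import numpy as np

def matting_affinity(img, radius=1, eps=1e-3):
    """Return dense (W, D) for a small H x W x 3 image, following Eqn. 18.

    Assumption: only windows that lie fully inside the image are used.
    """
    img = np.asarray(img, dtype=np.float64)
    h, w, _ = img.shape
    size = 2 * radius + 1
    n_win = size * size                    # |w_k|
    n = h * w                              # total number of pixels
    idx = np.arange(n).reshape(h, w)       # pixel index of each grid location
    W = np.zeros((n, n))
    D = np.zeros(n)
    for r in range(h - size + 1):
        for c in range(w - size + 1):
            members = idx[r:r + size, c:c + size].ravel()
            colors = img[r:r + size, c:c + size].reshape(n_win, 3)
            mu = colors.mean(axis=0)
            centered = colors - mu
            cov = centered.T @ centered / n_win
            reg_inv = np.linalg.inv(cov + (eps / n_win) * np.eye(3))
            # Contribution of this window to every pair (i, j) inside it.
            W[np.ix_(members, members)] += (1.0 + centered @ reg_inv @ centered.T) / n_win
            D[members] += 1.0              # one more window containing each member
    return W, np.diag(D)

rng = np.random.default_rng(0)
W, D = matting_affinity(rng.random((5, 6, 3)))
L = D - W
print(np.abs(L.sum(axis=1)).max())         # close to zero for every row
```

Each window contributes exactly one to the row sum of W for each of its member pixels, which is why the row sums of W reproduce the window counts on the diagonal of D.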

This leads to the formulation of the Bi-affinity filter as:

$\begin{matrix} {{h_{\sigma,ɛ}(x)} = \frac{\sum\limits_{y \in \Omega_{x}}\;\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}{I(y)}} \right)}{\sum\limits_{y \in \Omega_{x}}\;\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}} \right)}} & (22) \end{matrix}$

FIG. 2 illustrates a sample setup for implementing the present Bi-affinity filter. The input image may be provided in a digital memory 110 that is in communication with a data processing device 112, such as a CPU, ASIC, PLD, CPLD, or other type of data processing or computing device. In the present case, the data processing device processes the input image in memory 110 according to processing steps defined by equation 22, where the subscripts σ,ε denote the dependence of the filter output on these user specified parameters. I and h are the input and output images respectively, x and y are pixel locations over an image grid, and Ω_(x) is the neighborhood induced around the central pixel x. In the present case, f_(s)(x,y) measures the spatial affinity between pixels at x and y, where parameter σ controls the amount of spatial blurring and serves a similar purpose as the spatial filter variance (i.e. σ_(s) in eqn. 3) in a traditional bilateral filter. L_(xy) ^(ε) is the laplacian matrix (or matting laplacian) that provides a positive color affinity value for two examined pixels, x and y, that have the same color and provides a zero color affinity value for pixels with different color. The parameter ε, a regularization term, works analogously to the range variance parameter (i.e. σ_(r) in eqn. 4) in a traditional bilateral filter. Note that the relative weight attributed to regularization term ε determines the smoothness of the α estimates (i.e. the smoothness of color blending from pixel to pixel), which in the present work translates to the smoothness of the filtered image. The output image h may be saved back to memory 110 or may be stored in another memory device, not shown.

However, the Bi-affinity filter does not smooth the edge, because the affinity formulation is zero across the edge. This is in direct contrast to the Bilateral filter, which has an inherent bending effect at the edges. This image distortion can be observed in FIGS. 3 a-3 c. FIG. 3 a is an example of an original image with a sharp edge. FIG. 3 b shows the result of applying bilateral filtering. As is shown, bilateral filtering results in rounding of the edges. By contrast, applying Bi-affinity filtering preserves edge sharpness, as is shown in FIG. 3 c.

The calculation of the exact weight matrix W_(ij) (hereafter called the “affinity matrix”) as mentioned in Eqn. 18 involves evaluations over all possible overlapping windows that contain the center pixel. This involves O(w³) computations, where w is the size of the window. For example, if the window size is 3 (i.e. three pixels by three pixels), then nine overlapping windows would have to be evaluated per center pixel, which is more than is typically required in a traditional bilateral filter.

For example FIG. 4 a shows a center pixel C1 and nine overlapping 3×3 pixel windows 21-29 that include center pixel C1. Neighbor pixels N1-N8 are also noted. In the present example, the color affinity between center pixel C1 and target neighbor pixel N1 is desired.

The overall complexity can be reduced by evaluating the color affinity over a smaller set of possible windows. For example in FIG. 4 b, only the four overlapping windows, 22, 23, 24, and 29, that encompass both center pixel C1 and target neighbor pixel N1 may be evaluated.

In the simplest case, shown in FIG. 4 c, the terms of W_(i,j) can be evaluated locally, i.e. counting the contribution of only the center pixel's local window, 29, which is the window centered about C1. In this case, the complexity is similar to the typical bilateral range filter, which involves O(w²) computations. This simplest case of only evaluating terms of W_(i,j) local to a center pixel (i.e. only evaluating the center pixel's local window) is the basis of an approximate Bi-affinity filter.
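The window counts in the three cases above can be reproduced by enumeration. The following Python sketch lists the centers of all windows of a given size that contain a set of pixels. The coordinates and the function name are illustrative assumptions, since the exact position of neighbor pixel N1 in FIG. 4 a is not restated in the text.

```python
def windows_containing(pixels, size=3):
    """Centers of all size x size windows that contain every pixel in `pixels`."""
    r = size // 2
    pr0, pc0 = pixels[0]
    # Any window containing the first pixel has its center within r of it.
    candidates = [(pr0 + dr, pc0 + dc)
                  for dr in range(-r, r + 1)
                  for dc in range(-r, r + 1)]
    return [
        (cr, cc) for cr, cc in candidates
        if all(abs(pr - cr) <= r and abs(pc - cc) <= r for pr, pc in pixels)
    ]

center = (0, 0)
print(len(windows_containing([center])))            # windows containing the center pixel
print(len(windows_containing([center, (-1, 1)])))   # shared with a diagonal neighbor
print(len(windows_containing([center, (0, 1)])))    # shared with an edge-adjacent neighbor
```

For 3×3 windows this prints 9, 4 and 6. The count of four for a diagonal neighbor matches the four windows listed for FIG. 4 b, while an edge-adjacent neighbor would share six windows with the center pixel.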

To keep the following comparisons of the Bi-affinity filter with the bilateral filter fair (i.e. to assure that both have a similar level of complexity), the bilateral filter will instead be compared to the approximate Bi-affinity filter denoted by h^(l)(x) and herein defined as:

$\begin{matrix} {{h^{l}(x)} = \frac{\sum\limits_{y \in \Omega_{x}}\;{{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{x\; ɛ}{I(y)}}}{\sum\limits_{y \in \Omega_{x}}\;{{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{x\; ɛ}}}} & (23) \end{matrix}$ which considers only the local window centered around pixel x, denoted by L^(x).
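A minimal Python sketch of Eqn. 23 is given below. It rests on several assumptions that are not stated in the text: the spatial term f_s is taken to be a Gaussian of the pixel offset, the color term for the window centered at x is taken to be the single-window affinity summand of Eqn. 18, image borders are handled by edge replication, and a fallback to the input pixel is used when the normalizing sum is numerically zero. The function name, parameter names and default values are illustrative only.

```python
import numpy as np

def approx_bi_affinity(img, radius=1, sigma=1.0, eps=0.1):
    """Approximate Bi-affinity filter sketch for an H x W x 3 float image."""
    img = np.asarray(img, dtype=np.float64)
    h, w, _ = img.shape
    size = 2 * radius + 1
    n_win = size * size                               # |w_k|
    # Assumed spatial affinity: Gaussian of the offset from the center pixel.
    ax = np.arange(-radius, radius + 1)
    dy, dx = np.meshgrid(ax, ax, indexing="ij")
    f_s = np.exp(-(dx ** 2 + dy ** 2) / (2.0 * sigma ** 2)).ravel()
    padded = np.pad(img, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    out = np.empty_like(img)
    for r in range(h):
        for c in range(w):
            # Local window centered on pixel x = (r, c).
            patch = padded[r:r + size, c:c + size].reshape(n_win, 3)
            mu = patch.mean(axis=0)
            centered = patch - mu
            cov = centered.T @ centered / n_win
            reg = cov + (eps / n_win) * np.eye(3)
            # Affinity between the center pixel and every pixel y in the window.
            affinity = (1.0 + centered @ np.linalg.solve(reg, img[r, c] - mu)) / n_win
            weights = f_s * affinity
            denom = weights.sum()
            if abs(denom) < 1e-8:
                out[r, c] = img[r, c]                 # assumed fallback
            else:
                out[r, c] = weights @ patch / denom
    return out

flat = np.full((4, 5, 3), 0.25)
print(np.allclose(approx_bi_affinity(flat), flat))
```

On a constant-color image every affinity equals 1/|w_(k)|, so the sketch returns the input unchanged. The per-pixel Python loop favors readability over speed.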

The operations involved in computing the L_(ij) terms as mentioned in Eqn. 6 (or Eqn. 17) can be decomposed as a summation of Gaussian likelihoods over window dependent parameters μ_(w), Σ_(w). These parameters can be computed by accumulating first and second order sufficient statistics over windows. If memory complexity is not an issue, then pre-computing 9 integral images can be an option. These 9 integral images consist of one integral image for each of the R, G and B color channels, one for each of the RR, GG and BB products, and one for each of the RG, GB and RB products. For 3 channel color images, this is equivalent to storing 3 more images in memory. For very large images (HDTV etc.) this option may not be optimal due to the large memory overhead.
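The integral image option can be sketched as follows in Python. The sketch builds the nine summed-area tables described above and recovers the mean vector and covariance matrix of any window from four lookups per table. The helper names are hypothetical, and the covariance is normalized by the window cardinality, which is assumed to match the definition of σ_(k).

```python
import numpy as np

# Channel index pairs for the products RR, GG, BB, RG, GB and RB.
PAIRS = [(0, 0), (1, 1), (2, 2), (0, 1), (1, 2), (0, 2)]

def integral_image(channel):
    """Zero-padded summed-area table: ii[r, c] = sum of channel[:r, :c]."""
    return np.pad(channel, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, r0, c0, r1, c1):
    """Sum over rows r0..r1-1 and columns c0..c1-1 using four lookups."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

def build_integrals(img):
    """Three first order and six second order integral images."""
    img = np.asarray(img, dtype=np.float64)
    first = [integral_image(img[..., ch]) for ch in range(3)]
    second = {p: integral_image(img[..., p[0]] * img[..., p[1]]) for p in PAIRS}
    return first, second

def window_stats(first, second, r0, c0, size):
    """Mean vector and covariance of the size x size window at (r0, c0)."""
    n_win = size * size
    r1, c1 = r0 + size, c0 + size
    mu = np.array([box_sum(ii, r0, c0, r1, c1) for ii in first]) / n_win
    cov = np.empty((3, 3))
    for (a, b), ii in second.items():
        cov[a, b] = cov[b, a] = box_sum(ii, r0, c0, r1, c1) / n_win - mu[a] * mu[b]
    return mu, cov

rng = np.random.default_rng(1)
img = rng.random((8, 10, 3))
first, second = build_integrals(img)
mu, cov = window_stats(first, second, 2, 3, 5)
patch = img[2:7, 3:8].reshape(-1, 3)
print(np.allclose(mu, patch.mean(axis=0)), np.allclose(cov, np.cov(patch.T, bias=True)))
```

After the one-time construction, the cost of obtaining the statistics of a window no longer depends on the window size.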

Another method of computing the L_(ij) terms is to collect sufficient statistics for the current window and then update the statistics for each unit move from top to bottom and left to right, as proposed by the median filtering approach in Two-Dimensional Digital Signal Processing II, Transforms and Median Filters (by T. S. Huang, pages 209-211, 1981, Springer-Verlag, which is herein incorporated in its entirety by reference) and in Fast Median and Bilateral Filtering (ACM Trans. Graph., 25(3):519-526, 2006, by Weiss), which is also herein incorporated in its entirety by reference. Both of these methods can now be used to implement both the full and the approximate bi-affinity filter.
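The incremental alternative can be sketched in the same spirit. The Python example below is not the algorithm of either cited reference. It is only an illustration, with hypothetical names, of updating first and second order sums as a window slides from left to right along one band of rows, adding the column that enters the window and removing the column that leaves it.

```python
import numpy as np

def sliding_stats_row(img, r0, size):
    """Mean and covariance of every size x size window whose top row is r0."""
    img = np.asarray(img, dtype=np.float64)
    band = img[r0:r0 + size]                          # size x W x 3 band of rows
    n_win = size * size
    col_s1 = band.sum(axis=0)                         # per-column first order sums
    col_s2 = np.einsum("rwa,rwb->wab", band, band)    # per-column second order sums
    s1 = col_s1[:size].sum(axis=0)                    # statistics of the first window
    s2 = col_s2[:size].sum(axis=0)
    stats = []
    for c0 in range(img.shape[1] - size + 1):
        if c0 > 0:
            # Unit move to the right: add the entering column, drop the leaving one.
            s1 += col_s1[c0 + size - 1] - col_s1[c0 - 1]
            s2 += col_s2[c0 + size - 1] - col_s2[c0 - 1]
        mu = s1 / n_win
        stats.append((mu, s2 / n_win - np.outer(mu, mu)))
    return stats

rng = np.random.default_rng(2)
img = rng.random((6, 9, 3))
stats = sliding_stats_row(img, 1, 3)
mu, cov = stats[4]
patch = img[1:4, 4:7].reshape(-1, 3)
print(np.allclose(mu, patch.mean(axis=0)), np.allclose(cov, np.cov(patch.T, bias=True)))
```

The same update applies to moves from top to bottom by exchanging the roles of rows and columns, and it needs only the running sums rather than full integral images.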

The above approximate Bi-affinity filter (as described in eqn. 23) was tested.

With reference to FIG. 5, the regularization term ε in the affinity formulation works as an edge smoothness term. To understand the effect of this term, the amount of regularization ε used for the process was varied for each of nine square windows, W5, W7, W9, W11, W13, W15, W17, W19, and W21, with each window having a size of 5, 7, 9, 11, 13, 15, 17, 19, and 21 pixels per side, respectively. The peak signal-to-noise ratio, PSNR, for each window with respect to the original image was recorded. As shown, the PSNR degrades for larger window sizes. This is in keeping with the two color model, which is valid only for small windows. However, the regularization term ε neutralizes the effect of window size to a certain degree, as is evident from the band of values collecting near the PSNR value of 96 dB. This suggests a possible tradeoff between PSNR and edge smoothness. For very small regularization values, the noise across the edge can contribute to the jaggedness of the reconstruction. This effect can be countered by increasing regularization term ε, but the increased regularization comes at the cost of increased smoothness of the overall image.
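The text does not state how the PSNR values were computed. A minimal Python sketch of the standard definition, 10·log₁₀(peak²/MSE), is given below for reference. The function name and the `peak` parameter (the maximum possible pixel value, assumed here to be 1.0 for images scaled to [0, 1]) are assumptions.

```python
import numpy as np

def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((reference - test) ** 2)
    if mse == 0.0:
        return float("inf")                # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

original = np.zeros((4, 4, 3))
filtered = np.full((4, 4, 3), 0.1)
print(psnr(original, filtered))
```

In this toy example the mean squared error is 0.01, so the result is approximately 20 dB. Values obtained with a different peak value or image scaling are not directly comparable.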

Empirically, good results have been obtained for larger window sizes by keeping values of regularization term ε relatively larger than those proposed in the matting literature mentioned above (i.e. preferably at least 1.5 times larger).

With reference to FIG. 6, the effect of regularization term ε for a fixed window size is illustrated by showing the results of assigning values 0.0005, 0.005, 0.05, and 0.5 to regularization term ε. The edge reconstruction becomes increasingly jagged as the amount of regularization is decreased, as is more clearly seen in the closeup view at the upper-right corner of each sample image.

With reference to FIGS. 7 a and 7 b, comparisons of the PSNR of the present Bi-affinity filter applied in an image's native RGB color space versus the bilateral filter applied in both an image's native RGB color space and converted CIELAB color space are shown. For quantitative comparisons against the traditional bilateral filter, the window size is varied for a fixed range filter variance of 0.1 in FIG. 7 a, and for a fixed range filter variance of 1.0 in FIG. 7 b.

With ε=σ_(r)=0.1 in FIG. 7 a, the PSNR values of the present Bi-affinity filter 100 are better than those of the RGB bilateral filter 102, but are lower than those of the CIELAB bilateral filter 104. Nonetheless, the present Bi-affinity filter 100 is still within acceptable deviations from the CIELAB bilateral filter 104.

However, as is shown in FIG. 7 b, when the range filter variance is increased to 1.0 (i.e. ε=1), the performance of the present Bi-affinity filter 100 can surpass that of the CIELAB bilateral filter 104. More specifically, when the window is increased to at least 5×5 pixels (i.e. w=5), the performance of the present Bi-affinity filter in the image's native RGB color space surpasses that of the bilateral filter even in the converted CIELAB color space.

As mentioned earlier in reference to Eqn. 23, the approximate Bi-affinity filter, which is evaluated over one centered local window, closely emulates the traditional bilateral filter. The following experimental results compare the present approximate Bi-affinity filter to the traditional bilateral filter. These results are illustrated in FIG. 8.

With reference to FIG. 8, the upper-left corner image shows the original input image, to which the bilateral filter and the present approximate Bi-affinity filter will be applied. The results of applying the bilateral filter in the input image's native RGB color space (with the bilateral filter individually applied to each of the Red color channel, Green color channel, and Blue color channel) are shown in the upper-right corner. As shown, the resultant softened image suffers from color artifacts at sharp color edges that may be perceived as a loss in image definition.

However, if the input image is first converted from the RGB color space to the CIELAB color space, and the bilateral filter is then individually applied to each of the three CIELAB color channels, then the resultant softened image is more clearly defined, as is shown in the lower-left corner of FIG. 8.

Nonetheless, as is shown in the lower-right corner of FIG. 8, similar results are obtained by applying the present Bi-affinity filter to the original input image in its native RGB color space, with a regularization factor of ε=0.1. Thus, the responses of the bilateral filter in the CIELAB color space and the Bi-affinity filter in the native RGB color space are very similar, even though the Bi-affinity filter does not need, or use, any color conversions, resulting in a reduction in filter processing steps.

Two additional sets of experimental results similar to those of FIG. 8 for comparing the Bi-affinity filter to the bilateral filter in both the native RGB color space and in the CIELAB color space are shown in FIGS. 9 and 10.

The present Bi-affinity filter also has applications in enhanced image zooming. The original Bi-affinity filter of Eqn. 22 (i.e. the version not limited to evaluation over only one centered local window), has been shown to preserve very minute details. In other words, the present bi-affinity formulation preserves very intricate details of an image as compared to the traditional bilateral filter, which preserves only dominant edges of an image. In this regard, the present Bi-affinity filter can be understood to preserve all edges, whereas the bilateral filter preserves only strong edges. This important feature of the Bi-affinity filter leads to one of the most interesting applications of the Bi-affinity filter: image enhancement and zooming.

Image enhancement techniques attempt to convert a low-resolution image to a high-resolution image. In effect, image enhancement techniques try to generate high-resolution data from available low-resolution data by estimating missing information. This leads to numerous formulations, some learning based and some interpolation based. Basically, if one can infer the missing high-resolution data, then one can add the missing high-resolution data to an interpolation of the low-resolution data (i.e. an enlarged or zoomed-in version of the input image) that satisfies data fidelity constraints. The result is an estimated (i.e. generated) high-resolution image.

For example, with reference to FIG. 11 a, given a low-resolution input image, as shown, one can interpolate it to a desired higher resolution, and then add the missing high-resolution information to generate a final high-resolution result. The mean affinity at each pixel, which is the row wise normalized summation of W, appears to contain this missing detail. This missing detail is shown in FIG. 11 b. The Bi-affinity filter further places a smoothed affinity weighted kernel at each pixel, and the combined effect is the enhanced image shown in FIG. 11 c. This enhancement effect is a by-product of the filtering formulation of the Bi-affinity filter.

A second example of the enhancement feature of the present Bi-affinity filter is shown in FIGS. 12 a and 12 b. FIG. 12 a shows an image zoomed-in by a factor of 2× by bi-cubic interpolation. FIG. 12 b shows the same zoomed-in image additionally enhanced by the present Bi-affinity filter. Note the preservation of small image details.

A third example of the enhancement feature of the present Bi-affinity filter is shown in FIGS. 13 a and 13 b. FIG. 13 a shows an image zoomed-in by a factor of 2× by bi-cubic interpolation. FIG. 13 b shows the same zoomed-in image additionally enhanced by the present Bi-affinity filter.

A fourth example of the enhancement feature of the present Bi-affinity filter is shown in FIGS. 14 a and 14 b. FIG. 14 a shows an image zoomed-in by a factor of 2× by bi-cubic interpolation. FIG. 14 b shows the same zoomed-in image additionally enhanced by the present Bi-affinity filter.

A graphical model for image enhancement is shown in FIGS. 15 a and 15 b. FIG. 15 a illustrates a passive filtering technique and FIG. 15 b illustrates an active filtering technique. In both FIGS. 15 a and 15 b, the circles represent observed nodes, the squares represent function nodes, and the triangles represent hidden nodes to be estimated.

The method of the present invention is a passive filtering technique, and is best described by FIG. 15 a. Like other passive filtering techniques, e.g. bilateral, bicubic, etc., the present method only looks at the low-resolution observation layer (i.e. layer having lower circles LC along the dashed grid pattern) to generate the values of the high-resolution scene (i.e. the upper layer having upper triangles UT). In the present case, the filter kernel is applied to the observed node to generate the scene.

By comparison, active methods such as Markov random field (MRF) based models, shown in FIG. 15 b, impose neighborhood continuity constraints in the inferred layer as well. That is, scene nodes are also influenced by the neighbors in the hidden scene field. This is typically done through an interaction potential within the scene nodes. It is presently put forth that the principles of the present invention may be applied to such a model as well.

In summary, a new edge preserving filter, which works on the principle of matting affinity, is proposed. The formulation of matting affinity allows a better representation of the range filter term in the bilateral filter class. The definition of the affinity term can be relaxed to suit different applications.

An approximate Bi-affinity filter whose output is shown to be very similar to the traditional bilateral filter is defined. The present technique has the added advantage that no color space changes are required and hence an input image can be handled in its original color space, i.e. its native color space. This is a big benefit over the traditional bilateral filter, which needs a conversion from the native color space to a perception based color space, such as CIELAB, to generate results close to the present invention.

Furthermore, the full Bi-affinity filter preserves very minute details of the input image, and can easily be extended to an image enhancement application to permit an enhanced zooming function.

While the invention has been described in conjunction with several specific embodiments, it is evident to those skilled in the art that many further alternatives, modifications, and variations will be apparent in light of the foregoing description. Thus, the invention described herein is intended to embrace all such alternatives, modifications, applications and variations as may fall within the spirit and scope of the appended claims. 

1. A method of processing a digital, input image, comprising: providing said digital input image, I, in physical memory, said input image having color information in a native color space for each image pixel; providing a processing device coupled to said physical memory to access said input image I and create an output image, h, by implementing the following steps: (a) applying the following sequence to said input image I, ${h_{\sigma,ɛ}(x)} = \frac{\sum\limits_{y \in \Omega_{x}}\;\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}{I(y)}} \right)}{\sum\limits_{y \in \Omega_{x}}\;\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}} \right)}$ where x and y are pixel locations over an image grid, and Ω_(x) is the neighborhood induced around the central pixel x; f_(s)(x,y) measures the spatial affinity between pixels at x and y; parameter σ controls an amount of spatial blurring; L_(xy) ^(ε) is a laplacian matrix that provides positive color affinity value for two examined pixels, x and y, that have the same color and provides a zero color affinity value for pixels with different color; parameter ε is a regularization term whose relative weight determines the smoothness of output image h.
 2. The method of claim 1, wherein step (a) is applied to said input image I in its native color space.
 3. The method of claim 2, wherein said native color space is the RGB color space.
 4. The method of claim 1, wherein said laplacian matrix L_(xy) ^(ε) follows a 2 color model specifying that pixel color I_(i) in an input image I can be represented as a linear combination of two colors P and S as follows: I _(i)=α_(i) P+(1−α_(i))S, ∀i∈w, 0≤α_(i)≤1 where the two colors are piecewise smooth and can be derived from local properties within neighborhood Ω_(x) containing the pixel i, and wherein piecewise smooth means smooth within a region similar in size to window w.
 5. The method of claim 4, wherein parameter ε determines the smoothness of the decomposition into said two color modes as represented by the smoothness of α estimates.
 6. The method of claim 4, wherein α_(i) is defined as: ${\alpha_{i} = {{\sum\limits_{c}\;{a^{c}I_{i}^{c}}} + b}},{\forall{i \in w}}$ where a^(c) are weights for each color channel and are constant for the window w, b is a model parameter, and c is an index over the color channels.
 7. The method of claim 6, wherein the α values are determined according to quadratic cost function J(α)=α^(T)Lα.
 8. The method of claim 7, wherein laplacian matrix L is an n×n matrix, where n is the total number of pixels in input image I whose (ij)^(th) element is given by $\sum\limits_{k \mid {(i,j)} \in w_{k}}\left( {\delta_{ij} - {\frac{1}{\left| w_{k} \right|}\left( {1 + {\left( {I_{i} - \mu_{k}} \right)\left( {\sigma_{k} + {\frac{ɛ}{\left| w_{k} \right|}I_{3}}} \right)^{- 1}\left( {I_{j} - \mu_{k}} \right)}} \right)}} \right)$ where δ_(ij) is the Kronecker delta, μ_(k) is a 3×1 mean vector of colors inside the k^(th) window with both i and j as members, σ_(k) is the 3×3 covariance matrix, |w_(k)| is the cardinality of the window and I₃ is the identity matrix of size 3×3.
 9. The method of claim 8, wherein the laplacian matrix L is further decomposed into a diagonal matrix D and a weight matrix W with the formulation L=D−W, wherein: diagonal matrix D is defined with terms D_(ii)=#[k|i∈w_(k)] at its diagonal and represents the cardinality of the number of windows of which the pixel i is a member; individual terms of weight matrix W are given by $W_{ij} = {\sum\limits_{k \mid {(i,j)} \in w_{k}}{\frac{1}{\left| w_{k} \right|}\left( {1 + {\left( {I_{i} - \mu_{k}} \right)\left( {\sigma_{k} + {\frac{ɛ}{\left| w_{k} \right|}I_{3}}} \right)^{- 1}\left( {I_{j} - \mu_{k}} \right)}} \right)}}$ where δ_(ij) is the Kronecker delta, μ_(k) is a 3×1 mean vector of colors inside the k^(th) window with both i and j as members, σ_(k) is the 3×3 covariance matrix, |w_(k)| is the cardinality of the window and I₃ is the identity matrix of size 3×3; and color affinity W_(ij) for two pixels with the same color is a positive quantity varying with the homogeneity of the local windows containing the pixels i and j, and color affinity for pixels with different color is zero.
 10. The method of claim 9, wherein at the local minima, the solution α* satisfies a first order optimality condition L^(T)α*=0, and the optimal condition is defined as L^(T)α*=(D−W)^(T)α*.
 11. The method of claim 9, wherein terms of W_(i,j) are evaluated by counting the contribution of only the center pixel's local window defined as the window centered about it.
 12. The method of claim 11, wherein the local window w is a 3×3 pixel window requiring O(w²) computations.
 13. The method of claim 1, further including: (b) interpolating said input image I to a resolution higher than its native resolution; (c) combining the results of step (a) with the interpolated, higher resolution image of step (b).
 14. The method of claim 13, wherein step (a) generates detail line information of said input image I, and step (c) adds this line information to the interpolated, higher resolution image.
 15. The method of claim 1, wherein the laplacian matrix L is a zero-sum filter kernel.
 16. A method of enlarging an input image I, comprising: providing said digital input image, I, in physical memory, said input image having color information in a native color space for each image pixel; providing a processing device coupled to said physical memory to access said input image I and create an output image, h, by implementing the following steps: (a) applying the following sequence to said input image I, ${h_{\sigma,ɛ}(x)} = \frac{\sum\limits_{y \in \Omega_{x}}\;\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}{I(y)}} \right)}{\sum\limits_{y \in \Omega_{x}}\;\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}} \right)}$ where: (i) x and y are pixel locations over an image grid; (ii) Ω_(x) is the neighborhood induced around the central pixel x; (iii) f_(s)(x,y) measures the spatial affinity between pixels at x and y; (iv) parameter σ controls an amount of spatial blurring; (v) L_(xy) ^(ε) is a laplacian matrix that provides positive color affinity value for two examined pixels, x and y, that have the same color and provides a zero color affinity value for pixels with different color; (vi) parameter ε is a regularization term whose relative weight determines the smoothness of output image h; (vii) laplacian matrix L is defined as a diagonal matrix, D, and a weight matrix, W, with the formulation L=D−W, where diagonal matrix D is further defined as D_(ii)=#[k|i∈w_(k)] at its diagonal, which represents the cardinality of the number of windows of which the pixel i is a member, individual terms of the weight matrix W are given by $W_{ij} = {\sum\limits_{k \mid {(i,j)} \in w_{k}}{\frac{1}{\left| w_{k} \right|}\left( {1 + {\left( {I_{i} - \mu_{k}} \right)\left( {\sigma_{k} + {\frac{ɛ}{\left| w_{k} \right|}I_{3}}} \right)^{- 1}\left( {I_{j} - \mu_{k}} \right)}} \right)}}$ and weight matrix W_(ij) is evaluated over all possible overlapping windows that contain the center pixel; and (b) interpolating said input image I to a resolution higher than its native resolution; (c) combining the results of step (a) with the interpolated, higher resolution image of step (b).
 17. The method of claim 16, wherein step (b) generates detail line information of said input image I, and step (c) adds this line information to the interpolated, higher resolution image.
 18. The method of claim 16, wherein step (a) is applied to said input image I in its native color space.
 19. The method of claim 17, wherein said native color space is the RGB color space.
 20. A method of smoothing an input image I, comprising: providing said digital input image, I, in physical memory, said input image having color information in a native color space for each image pixel; providing a processing device coupled to said physical memory to access said input image I and create an output image, h, by implementing the following steps: (a) applying the following sequence to said input image I, ${h_{\sigma,ɛ}(x)} = \frac{\sum\limits_{y \in \Omega_{x}}\;\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}{I(y)}} \right)}{\sum\limits_{y \in \Omega_{x}}\;\left( {{f_{s}^{\sigma}\left( {x,y} \right)}L_{xy}^{ɛ}} \right)}$ where: (i) x and y are pixel locations over an image grid; (ii) Ω_(x) is the neighborhood induced around the central pixel x; (iii) f_(s)(x,y) measures the spatial affinity between pixels at x and y; (iv) parameter σ controls an amount of spatial blurring; (v) L_(xy) ^(ε) is a laplacian matrix that provides positive color affinity value for two examined pixels, x and y, that have the same color and provides a zero color affinity value for pixels with different color; (vi) parameter ε is a regularization term whose relative weight determines the smoothness of output image h; (vii) laplacian matrix L is defined as a diagonal matrix, D, and a weight matrix, W, with the formulation L=D−W, where diagonal matrix D is further defined as D_(ii)=#[k|i∈w_(k)] at its diagonal, which represents the cardinality of the number of windows of which the pixel i is a member, individual terms of the weight matrix W are given by $W_{ij} = {\sum\limits_{k \mid {(i,j)} \in w_{k}}{\frac{1}{\left| w_{k} \right|}\left( {1 + {\left( {I_{i} - \mu_{k}} \right)\left( {\sigma_{k} + {\frac{ɛ}{\left| w_{k} \right|}I_{3}}} \right)^{- 1}\left( {I_{j} - \mu_{k}} \right)}} \right)}}$ and terms of W_(i,j) are evaluated by counting the contribution of only the center pixel's local window defined as the window centered about it.
 21. The method of claim 20, wherein the local window w is a 3×3 pixel window.
 22. The method of claim 20, wherein regularization factor of ε is at least 0.1.
 23. The method of claim 20, wherein step (a) is applied to said input image I in its native color space.
 24. The method of claim 23, wherein said native color space is the RGB color space. 